An Efficient Approach for Approximating Multi-dimensional Range Queries and Nearest Neighbor Classification in Large

Authors

  • Carlotta Domeniconi
  • Dimitrios Gunopulos
Abstract

We propose a locally adaptive technique to address the problem of setting the bandwidth parameters optimally for kernel density estimation. Our technique is efficient and can be performed in only two dataset passes. We also show how to apply our technique to efficiently solve range query approximation, classification and clustering problems for very large datasets. We validate the efficiency and accuracy of our technique by presenting experimental results on a variety of both synthetic and real datasets.
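
The abstract does not spell out the estimator, so the following is only a rough sketch of the general idea of locally adaptive bandwidths, not the authors' algorithm. It assumes a product Gaussian kernel and an Abramson-style rule in which each point's bandwidth is scaled by its pilot density raised to the power `-alpha`. The names `pilot_density`, `local_scales`, `range_selectivity` and the parameter `alpha` are hypothetical. The pilot step here is quadratic in the number of points, so it does not reproduce the two-pass efficiency claimed above.

```python
import math


def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pilot_density(x, data, h):
    """Fixed-bandwidth product-Gaussian density estimate at point x."""
    total = 0.0
    for p in data:
        k = 1.0
        for j in range(len(x)):
            z = (x[j] - p[j]) / h[j]
            k *= math.exp(-0.5 * z * z) / (h[j] * math.sqrt(2.0 * math.pi))
        total += k
    return total / len(data)


def local_scales(data, h, alpha=0.5):
    """Per-point bandwidth multipliers: larger in sparse regions (assumed rule)."""
    dens = [pilot_density(p, data, h) for p in data]
    # Normalize by the geometric mean so the multipliers average to 1 in log space.
    g = math.exp(sum(math.log(f) for f in dens) / len(dens))
    return [(f / g) ** (-alpha) for f in dens]


def range_selectivity(lo, hi, data, h, scales):
    """Estimated fraction of points inside the axis-aligned box [lo, hi]."""
    total = 0.0
    for p, s in zip(data, scales):
        mass = 1.0
        for j in range(len(lo)):
            hj = h[j] * s  # locally adapted bandwidth for this point
            mass *= gaussian_cdf((hi[j] - p[j]) / hj) - gaussian_cdf((lo[j] - p[j]) / hj)
        total += mass
    return total / len(data)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0)]
    h = (0.5, 0.5)
    scales = local_scales(data, h)
    print(range_selectivity((-1.0, -1.0), (1.0, 1.0), data, h, scales))
```

With a Gaussian kernel, the mass each point contributes to a box has a closed form through the normal CDF, which is why a range query can be approximated without scanning the raw data once the per-point bandwidths are known.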

Similar resources

Application of k Nearest Neighbor on Feature Projections Classifier to Text Categorization

This paper presents the results of the application of an instance-based learning algorithm, k Nearest Neighbor Method on Feature Projections (k-NNFP), to text categorization and compares it with the k Nearest Neighbor Classifier (k-NN). k-NNFP is similar to k-NN except it finds the nearest neighbors according to each feature separately. Then it combines these predictions using a majority voting. This proper...

Metrics and Models for Handwritten Character Recognition

A digitized handwritten numeral can be represented as a binary or greyscale image. An important pattern recognition task that has received much attention lately is to automatically determine the digit, given the image. While many different techniques have been pushed very hard to solve this task, the most successful and intuitively appropriate is due to Simard (Simard, LeCun & Denker 1993). Thei...

Symbolic Nearest Mean Classifiers

Piew Datta and Dennis Kibler, Department of Information and Computer Science, University of California, Irvine, CA 92717, fpdatta, [email protected] Abstract: The minimum-distance classifier summarizes each class with a prototype and then uses a nearest neighbor approach for classification. Three drawbacks of the minimum-distance classifier are its inability to work with symbolic attributes, weigh at...

A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is unwanted email that is harmful to communications around the world. It is a growing problem for personal email, so detecting it is essential. Machine learning is very useful for this problem, as its adaptive nature lets it learn the patterns required for classification with good results. Nonetheless, in spam detection, there are a large num...

EFFECT OF THE NEXT-NEAREST NEIGHBOR INTERACTION ON THE ORDER-DISORDER PHASE TRANSITION

In this work, one- and two-dimensional lattices are studied theoretically by a statistical mechanical approach. The nearest and next-nearest neighbor interactions are both taken into account, and the approximate thermodynamic properties of the lattices are calculated. The results of our calculations show that: (1) even though the next-nearest neighbor interaction may have an insignificant ef...

Publication date: 2001